Incorporating gene networks into statistical tests for genomic data via a spatially correlated mixture model
Abstract
MOTIVATION: A common task in genomic studies is to identify a subset of genes that satisfy certain conditions, such as differentially expressed genes or regulatory target genes of a transcription factor (TF). This can be formulated as a statistical hypothesis testing problem. Most existing approaches treat the genes as independent and identically distributed a priori, testing each gene independently or testing subsets of the genes one by one. However, genes are known to work in coordination, as dictated by gene networks. Treating genes equally and independently ignores the information contained in gene networks, leading to inefficient analysis and reduced power.

RESULTS: We propose incorporating gene network information into statistical analysis of genomic data. Specifically, rather than treating the genes equally and independently a priori as in a standard mixture model, we assume that gene-specific prior probabilities are correlated as induced by a gene network: genes are allowed to have different prior probabilities, but genes that are neighbors in the network have similar prior probabilities, reflecting their shared biological functions. We applied both approaches, the standard mixture model and the proposed one, to a real ChIP-chip dataset (and to simulated data) to identify the transcriptional target genes of the TF GCN4. The new method was found to be more powerful in discovering the target genes.
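The idea in the abstract above can be illustrated with a small sketch. It is not the authors' model or estimation procedure: the function names, the neighbor-averaging rule on the logit scale, the `weight` parameter, the Gaussian null and alternative components, and the toy network and scores are all assumptions chosen only to show how network-smoothed, gene-specific priors change the posterior probability that a gene is a target.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def smooth_priors(priors, neighbors, weight=0.5):
    """Pull each gene's prior toward the mean logit of its network neighbors.

    Hypothetical smoothing rule (an assumption, not the paper's prior):
    new_logit = (1 - weight) * own_logit + weight * mean(neighbor logits).
    Genes without neighbors keep their own prior.
    """
    logits = [math.log(p / (1 - p)) for p in priors]
    smoothed = []
    for i, logit in enumerate(logits):
        nbrs = neighbors.get(i, [])
        if nbrs:
            nbr_mean = sum(logits[j] for j in nbrs) / len(nbrs)
            logit = (1 - weight) * logit + weight * nbr_mean
        smoothed.append(1 / (1 + math.exp(-logit)))
    return smoothed


def posterior_target(z, prior, null=(0.0, 1.0), alt=(2.0, 1.0)):
    """Posterior probability of the alternative in a two-component mixture.

    The null and alternative (mean, sd) pairs are illustrative assumptions.
    """
    f0 = normal_pdf(z, *null)
    f1 = normal_pdf(z, *alt)
    return prior * f1 / (prior * f1 + (1 - prior) * f0)


# Toy network: genes 0-1-2 form a chain, gene 3 is isolated (made-up data).
neighbors = {0: [1], 1: [0, 2], 2: [1], 3: []}
z_scores = [2.5, 1.0, 2.5, 1.0]
initial_priors = [0.8, 0.2, 0.8, 0.2]

smoothed = smooth_priors(initial_priors, neighbors)
posteriors = [posterior_target(z, p) for z, p in zip(z_scores, smoothed)]

for gene, (p, post) in enumerate(zip(smoothed, posteriors)):
    print(f"gene {gene}: prior={p:.3f} posterior={post:.3f}")
```

In this toy setup, genes 1 and 3 have the same score and the same initial prior, but gene 1 sits between two genes with high priors, so its smoothed prior and its posterior are higher than those of the isolated gene 3. A standard mixture model with one shared prior would score the two genes identically.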
Related resources
Spatially Correlated Mixture Models with Application in Genomic Hypothesis Testing
An important task in genomic studies is the detection of genes satisfying certain conditions, such as being regulatory targets of a transcription factor (TF). With high-throughput data, e.g. microarray data, this is usually formulated as a simultaneous hypothesis testing problem. A Gaussian mixture model (GMM) can be used for such a problem. However, the standard GMM assumes that all the genes have an...
Detection of gene copy number changes in CGH microarrays using a spatially correlated mixture model
MOTIVATION Comparative genomic hybridization array experiments that investigate gene copy number changes present new challenges for statistical analysis and call for methods that incorporate spatial dependence between sequences along the chromosome. For this purpose, we propose a novel method called CGHmix. It is based on a spatially structured mixture model with three states corresponding to g...
Network-based genomic discovery: application and comparison of Markov random field models.
As biological knowledge accumulates rapidly, gene networks encoding genome-wide gene-gene interactions have been constructed. As an improvement over the standard mixture model that tests all the genes iid a priori, Wei and Li (2007) and Wei and Pan (2008) proposed modeling a gene network as a Discrete- or Gaussian-Markov random field (DMRF or GMRF) respectively in a mixture model to analyze gen...
Bayesian Joint Modeling of Multiple Gene Networks and Diverse Genomic Data to Identify Target Genes of a Transcription Factor.
We consider integrative modeling of multiple gene networks and diverse genomic data, including protein-DNA binding, gene expression and DNA sequence data, to accurately identify the regulatory target genes of a transcription factor (TF). Rather than treating all the genes equally and independently a priori in existing joint modeling approaches, we incorporate the biological prior knowledge that...
Spatial Beta Regression Model with Random Effect
Abstract: In many applications we encounter bounded dependent variables. The beta regression model can be used to deal with these kinds of response variables. In this paper we aim to study spatially correlated responses in the unit interval. Initially we introduce a spatial beta generalized linear mixed model in which the spatial correlation is captured through a random effect. T...
Journal: Bioinformatics
Volume 24, Issue 3
Pages: -
Publication date: 2008